Content-Based Classification and Retrieval of Audio
Abstract
An online audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of the feature extraction, classification, and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90%. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.
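To make two of the features named in the abstract concrete, the following is a minimal sketch of a per-frame energy function and zero-crossing rate, followed by a toy threshold rule. It is an illustration under stated assumptions, not the authors' procedure: the function names, the frame length and hop, and the `silence_energy` threshold are all hypothetical, and the abstract does not give the actual thresholds or the fundamental-frequency estimator.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_frame(frame, silence_energy=1e-4):
    """Toy threshold rule; silence_energy is a hypothetical value, not from the paper."""
    return "silence" if short_time_energy(frame) < silence_energy else "non-silence"


# Synthetic input: one second of a 440 Hz tone followed by half a second of silence.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
signal = tone + [0.0] * (sample_rate // 2)

frames = frame_signal(signal, frame_len=256, hop=128)
energy_curve = [short_time_energy(f) for f in frames]   # temporal curve of energy
zcr_curve = [zero_crossing_rate(f) for f in frames]     # temporal curve of ZCR
labels = [label_frame(f) for f in frames]
```

Computing the features per frame yields the temporal curves the abstract refers to; statistics of those curves (for example mean and variance over a window) could then feed a threshold-based decision of the kind described.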
Related resources
Hierarchical System for Content-based Audio Classification and Retrieval
A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of the energy function, the average zero-crossin...
Hierarchical classification of audio data for archiving and retrieving
A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is called the coarse-level audio classification and segmentation, where audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical...
Classification of general audio data for content-based retrieval
In this paper, we address the problem of classification of continuous general audio data (GAD) for content-based retrieval, and describe a scheme that is able to classify audio segments into seven categories consisting of silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. We studied a total of 143 classificat...
Speaker Identification for Audio Indexing Applications
Sofia Tsekeridou and Ioannis Pitas, Dept. of Informatics, Aristotle Univ. of Thessaloniki, Box 451, Thessaloniki 54006, GREECE. Tel: +30 31 996304, Fax: +30 31 996304, e-mail: [email protected] ABSTRACT A method for identifying different speakers from an audio source of continuous speech is described in this paper aiming at extracting the speaker sequence, timing information and speaker identi...
A Similarity Measure for Automatic Audio Classification
This paper presents recent results using statistics generated by a MMI-supervised vector quantizer as a measure of audio similarity. Such a measure has proved successful for talker identification, and the extension from speech to general audio, such as music, is straightforward. A classifier that distinguishes speech from music and non-vocal sounds is presented, as well as experimental results sh...